A tutorial on adaptive MCMC

Statistics and Computing

Abstract

We review adaptive Markov chain Monte Carlo (MCMC) algorithms as a means to optimise their performance. Using simple toy examples we review their theoretical underpinnings, and in particular show why adaptive MCMC algorithms might fail when some fundamental properties are not satisfied. This leads to guidelines concerning the design of correct algorithms. We then review criteria and the useful framework of stochastic approximation, which allows one to systematically optimise generally used criteria, but also to analyse the properties of adaptive MCMC algorithms. We then propose a series of novel adaptive algorithms which prove to be robust and reliable in practice. These algorithms are applied to artificial and high-dimensional scenarios, but also to the classic mine disaster dataset inference problem.
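To make the idea of tuning an MCMC sampler through stochastic approximation concrete, here is a minimal sketch. It is not one of the algorithms proposed in the article. It assumes a one-dimensional target, a Gaussian random-walk Metropolis proposal, a Robbins-Monro step size of i^(-0.6), and a target acceptance rate of 0.44 (a commonly quoted value in one dimension). The function name adaptive_rwm and all of its parameters are hypothetical, chosen for illustration only.

```python
import math
import random


def adaptive_rwm(log_target, x0, n_iter, target_accept=0.44, seed=0):
    """1-D random-walk Metropolis whose proposal scale is adapted on the fly.

    Illustrative sketch only: the log of the proposal scale is updated by a
    Robbins-Monro recursion that pushes the acceptance probability towards
    `target_accept`. The step size shrinks with the iteration count, so the
    amount of adaptation diminishes over time.
    """
    rng = random.Random(seed)
    x = x0
    log_px = log_target(x)
    log_scale = 0.0  # assumed starting value: proposal std. dev. of 1
    samples = []
    n_accept = 0

    for i in range(1, n_iter + 1):
        # Propose a Gaussian random-walk move with the current scale.
        proposal = x + math.exp(log_scale) * rng.gauss(0.0, 1.0)
        log_pp = log_target(proposal)

        # Metropolis acceptance probability min(1, pi(y) / pi(x)).
        diff = log_pp - log_px
        accept_prob = 1.0 if diff >= 0.0 else math.exp(diff)
        if rng.random() < accept_prob:
            x, log_px = proposal, log_pp
            n_accept += 1

        # Robbins-Monro update: grow the scale when accepting too often,
        # shrink it when accepting too rarely.
        gamma = i ** -0.6
        log_scale += gamma * (accept_prob - target_accept)

        samples.append(x)

    return samples, math.exp(log_scale), n_accept / n_iter


if __name__ == "__main__":
    # Standard normal target, specified up to an additive constant.
    draws, scale, rate = adaptive_rwm(lambda x: -0.5 * x * x, 0.0, 20000)
    print(f"final scale: {scale:.2f}, acceptance rate: {rate:.3f}")
```

The decreasing step size is the design choice that matters here: with a constant step the proposal would keep changing forever, which is the kind of situation in which an adaptive sampler can fail to target the intended distribution.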


Author information

Correspondence to Christophe Andrieu.

Cite this article

Andrieu, C., Thoms, J. A tutorial on adaptive MCMC. Stat Comput 18, 343–373 (2008). https://doi.org/10.1007/s11222-008-9110-y
